
    The Semantic Relations of Artifacts in DanNet

    Proceedings of the NODALIDA 2009 workshop WordNets and other Lexical Semantic Resources — between Lexical Semantics, Lexicography, Terminology and Formal Ontologies. Editors: Bolette Sandford Pedersen, Anna Braasch, Sanni Nimb and Ruth Vatvedt Fjeld. NEALT Proceedings Series, Vol. 7 (2009), 21-26. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9209

    Semantiske relationer i en ny dansk begrebsordbog: genbrug på tværs af ordbøger

    This article discusses the synergies between thesauri, wordnets and semasiological dictionaries, describing how the lexical data from a wordnet (DanNet), as well as the data from a monolingual dictionary (The Danish Dictionary), are being used to establish a Danish thesaurus. In contrast to other thesauri, this one’s main organizing principle is the grouping of words by semantic criteria. It thus combines words from different parts of speech in different types of groups. Furthermore, the formalized data on the words’ semantic relations leave us with a large number of possibilities when it comes to using thesaurus information in the online presentation of The Danish Dictionary. It also means that the thesaurus can be used to extend the number of semantic relations in DanNet

    Fra ordbog til wordnet. Hvordan udmøntes en traditionel ordbogsdefinition i en formaliseret wordnetbeskrivelse?

    In this article we discuss the problems involved in transforming traditional dictionary definitions into formalised descriptions meant for computational use. The starting point is the making of the Danish wordnet, DanNet, which is based on a large monolingual dictionary. In the process of extracting semantic information from the entries it turns out that much information is already present in the dictionary text, but a lot of useful information is only implicitly there or presupposed. The article falls into three parts: firstly, the principles of lexicographic definitions are discussed, in particular the ones used in The Danish Dictionary (Den Danske Ordbog). Secondly, it is demonstrated how definitions are “translated” into semantic relations in the wordnet, and examples of how missing information is added are given. Finally, the idea of making general dictionaries and wordnets in one operation is put forward and discussed

    Fra begrebsordbog til sprogteknologisk ressource: verber, semantiske roller og rammer – et pilotstudie

    This paper describes a method of compiling a lexicon of Danish semantic frames within the model of the Berkeley FrameNet (BFN). Large groups of near-synonymous verbs and verbal nouns, including multiword units, within the domains of communication and cognition are identified and extracted from the source manuscript of a newly published Danish thesaurus. Each word or expression is then assigned an appropriate frame from BFN. The fact that words within the same domain all belong to a manageable subset of frames in BFN makes it possible to map a high number of words to their corresponding frames simultaneously. In a forthcoming annotation project where words within the same two domains are already identified in the corpus, the idea is to pre-annotate with the frames in our lexicon, leaving human annotators to confirm the frame afterwards and to test whether it is possible to identify the BFN semantic roles described for English in the Danish text. Our method reveals some interesting divergences between the semantic divisions established in the thesaurus and the ones found in BFN, showing that the two resources contribute different types of linguistic information and thereby constitute a useful supplement to one another

    At have begreb skabt om noget – om Den Danske Begrebsordbog

    This paper describes the process of compiling a comprehensive Danish thesaurus at the Society for Danish Language and Literature (DSL) to be published in 2014. The thesaurus is mainly based on the sense descriptions and the lemma selection in our monolingual dictionary, Den Danske Ordbog (DDO). We will focus on the editing process and show how we use digital methods to establish annotated semantic groups which are afterwards automatically transferred to word class groups in the printed version of the thesaurus. The index constitutes an important part of the dictionary and will also be discussed, as will the future plans of making use of the thesaurus data, in particular in other dictionaries published by DSL

    Title Pages

    Proceedings of the NODALIDA 2009 workshop WordNets and other Lexical Semantic Resources — between Lexical Semantics, Lexicography, Terminology and Formal Ontologies. Editors: Bolette Sandford Pedersen, Anna Braasch, Sanni Nimb and Ruth Vatvedt Fjeld. NEALT Proceedings Series, Vol. 7 (2009), i-ii. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9209

    Updating the dictionary: Semantic change identification based on change in bigrams over time

    We investigate a method of updating a Danish monolingual dictionary with new semantic information on already included lemmas in a systematic way, based on the hypothesis that the variation in bigrams over time in a corpus might indicate changes in the meaning of one of the words. The method combines corpus statistics with manual annotations. The first step consists in measuring the collocational change in a homogeneous newswire corpus with texts from a 14-year time span, 2005 through 2018, by calculating all the statistically significant bigrams. These are then applied to a new version of the corpus that is split into one sub-corpus per year. We then collect all the bigrams that do not appear at all in the first three years, but appear at least 20 times in the following 11 years. The output, a dataset of 745 bigrams considered to be potentially new in Danish, is double-annotated and, depending on the annotations and the inter-annotator agreement, either discarded or divided into groups of relevant data for further investigation. We then carry out a more thorough lexicographical study of the bigrams in order to determine the degree to which they support the identification of new senses and lead to revised sense inventories for at least one of the words. Furthermore, we study the relation between the revisions carried out, the annotation values and the degree of inter-annotator agreement. Finally, we compare the resulting updates of the dictionary with Cook et al. (2013), and discuss whether the method might lead to a more consistent way of revising and updating the dictionary in the future
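
    The bigram selection step described in this abstract (absent in the first three years, at least 20 occurrences in the following 11) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name candidate_new_bigrams, the input layout (one Counter of bigram frequencies per year) and the toy counts are all assumptions made for illustration, and the earlier step of selecting statistically significant bigrams is assumed to have been done already.

```python
from collections import Counter


def candidate_new_bigrams(counts_by_year, first_year=2005, last_year=2018,
                          baseline_years=3, min_later_count=20):
    """Return bigrams unseen in the baseline years but frequent afterwards.

    counts_by_year is a hypothetical input format: {year: Counter({bigram: freq})}.
    """
    baseline = range(first_year, first_year + baseline_years)
    later = range(first_year + baseline_years, last_year + 1)

    # Every bigram that occurs at least once in the baseline years.
    seen_early = set()
    for year in baseline:
        for bigram, freq in counts_by_year.get(year, {}).items():
            if freq > 0:
                seen_early.add(bigram)

    # Total frequency of each bigram over the later years.
    later_totals = Counter()
    for year in later:
        later_totals.update(counts_by_year.get(year, {}))

    return sorted(
        bigram for bigram, total in later_totals.items()
        if total >= min_later_count and bigram not in seen_early
    )


# Invented toy counts, only to show the call; not data from the study.
toy = {
    2005: Counter({("old", "phrase"): 2}),
    2010: Counter({("new", "phrase"): 12, ("old", "phrase"): 30}),
    2015: Counter({("new", "phrase"): 9}),
}
print(candidate_new_bigrams(toy))
```

    With the default arguments the baseline covers 2005 through 2007 and the later window covers 2008 through 2018, matching the three-year and eleven-year spans mentioned in the abstract.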

    DanNet: udvikling og anvendelse af det danske wordnet

    A wordnet for Danish is under compilation as a joint project between the Centre for Language Technology at the University of Copenhagen and the Society for Danish Language and Literature. Both partners have recently been involved in the development of relevant lexical resources which are utilized as an important part of the current project, most importantly The Danish Dictionary, a corpus-based dictionary of modern Danish, and the international SIMPLE project. This article describes how the existing data are reused to create a much sought-after resource within Danish language technology. Typical examples and problems faced during the editing process are presented, with focus on polysemy, synonymy and semantic classification. Finally, we outline some perspectives for lexicographic products aimed at human users. Specifically, the development potential for the productive function of dictionaries is discussed, as are ways of improving the pedagogical function of learners’ dictionaries

    Dansk betydningsinventar i et datalingvistisk perspektiv

    In this paper we investigate the Danish sense inventory from a paradigmatic and a syntagmatic perspective, respectively, and we present a collection of related lexical semantic resources that we have developed in collaboration between The Society for Danish Language and Literature and The University of Copenhagen. The resources comprise a Danish wordnet (DanNet), The Danish FrameNet Lexicon, and The Danish Sentiment Lexicon. All three resources are designed to enable semantic processing to be used in digital humanities research as well as more broadly in language-centric technology development. Finally, in order to illustrate the use of the resources when processing running text, we provide some annotation examples of each resource